A Better Decision Tree: The Max-Cut Decision Tree with Modified PCA Improves Accuracy and Running Time
Abstract
Decision trees are a widely used method for classification, both alone and as the building blocks of multiple different ensemble learning methods. The Max Cut decision tree introduced here involves novel modifications to a standard, baseline variant of a classification decision tree, CART Gini. One modification involves an alternative splitting metric, Max Cut, based on maximizing the distance between all pairs of observations that belong to separate classes and separate sides of the threshold value. The other modification, Node Means PCA, selects the decision feature from a linear combination of the input features constructed using an adjustment to principal component analysis (PCA) locally at each node. Our experiments show that this node-based, localized PCA with the Max Cut splitting metric can dramatically improve classification accuracy while also significantly decreasing computational time compared to the CART Gini decision tree. These improvements are most significant for higher-dimensional datasets. For the example dataset CIFAR-100, the modifications enabled a 49% improvement in accuracy, relative to CART Gini, while providing a 6.8× speed up compared to the Scikit-Learn implementation of CART Gini. These modifications are expected to advance the capabilities of decision trees for difficult classification tasks.
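The splitting metric described in the abstract can be illustrated with a small sketch. The code below is an assumption-laden reading of the one-sentence description, not the authors' implementation: it takes a single numeric feature, uses the absolute difference as the distance, and scores a threshold by summing that distance over all pairs of observations that fall on opposite sides of the threshold and carry different class labels. The function names `max_cut_score` and `best_threshold` are hypothetical, and the brute-force pairwise computation is chosen for readability rather than speed.

```python
import numpy as np


def max_cut_score(x, y, threshold):
    """Sum of |x_i - x_j| over pairs split by `threshold` with different labels.

    Assumption: one numeric feature and absolute difference as the distance.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    left = x <= threshold
    right = ~left
    # Pairwise distances between every left-side and right-side observation.
    dist = np.abs(x[left][:, None] - x[right][None, :])
    # Keep only the pairs whose class labels differ.
    different_class = y[left][:, None] != y[right][None, :]
    return float(dist[different_class].sum())


def best_threshold(x, y):
    """Try the midpoints between consecutive distinct values; return the best."""
    x = np.asarray(x, dtype=float)
    values = np.unique(x)  # sorted distinct feature values
    candidates = (values[:-1] + values[1:]) / 2.0
    if candidates.size == 0:
        return None, 0.0  # a constant feature cannot be split
    scores = [max_cut_score(x, y, t) for t in candidates]
    best = int(np.argmax(scores))
    return float(candidates[best]), scores[best]


if __name__ == "__main__":
    feature = [0, 1, 10, 11]
    labels = [0, 0, 1, 1]
    print(best_threshold(feature, labels))
```

On this toy input the midpoint 5.5 separates the two classes completely, so every cross-threshold pair contributes its distance and that threshold receives the highest score.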
Similar resources
Decision Tree with Better Ranking
AUC (Area Under the Curve) of ROC (Receiver Operating Characteristics) has been recently used as a measure for ranking performance of learning algorithms. In this paper, we present a novel probability estimation algorithm that improves the AUC value of decision trees. Instead of estimating the probability at the single leaf where the example falls into, our method averages probability estimates...
An Algorithm for Better Decision Tree
The present paper aims at constructing the decision tree for a given database which adopts an improved ID3 decision tree algorithm to implement data mining in order to predict the output. The database is generated using the sampling techniques and the classification algorithm is applied on the samples. The obtained results are compared with experimental results in order to verify the validity a...
Predicting Twist Condition by Bayesian Classification and Decision Tree Techniques
Railway infrastructures are among the most important national assets of countries. Most of the annual budget of infrastructure managers are spent on repairing, improving and maintaining railways. The best repair method should consider all economic and technical aspects of the problem. In recent years, data analysis of maintenance records has contributed significantly for minimizing the costs. B...
P155: Differential Diagnosis of Panic Attacks: Using a Decision Tree
Panic attacks are discrete episodes of intense fear or discomfort accompanied by symptoms such as palpitations, shortness of breath, sweating, trembling, derealization and a fear of losing control or dying. Although panic attacks are required for a diagnosis of panic disorder, they also occur in association with a host of other disorders listed in the 5th version of the diagnostic and statistica...
A Decision Tree for Technology Selection of Nitrogen Production Plants
Nitrogen is produced mainly from its most abundant source, the air, using three processes: membrane, pressure swing adsorption (PSA) and cryogenic. The most common method for evaluating a process is using the selection diagrams based on feasibility studies. Since the selection diagrams are presented by different companies, they are biased, and provide unsimilar and even controversial results. I...
Journal
Journal title: SN Computer Science
Year: 2022
ISSN: 2661-8907, 2662-995X
DOI: https://doi.org/10.1007/s42979-022-01147-4